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We study the graph coloring problem over random graphs of finite average connectivity c. Given 
a number q of available colors, we find that graphs with low connectivity admit almost always a 
proper coloring whereas graphs with high connectivity are uncolorable. Depending on q, we find the 
precise value of the critical average connectivity c q . Moreover, we show that below c q there exist a 
clustering phase c £ [cd, c q ] in which ground states spontaneously divide into an exponential number 
of clusters. Furthermore, we extended our considerations to the case of single instances showing 
consistent results. This lead us to propose a new algorithm able to color in polynomial time random 
graphs in the hard but colorable region, i.e when c £ [c^c^]. 

PACS numbers: 89.20.Ff, 75.10.Nr, 05.70.Fh, 02.70.-c 

I. INTRODUCTION 

The Graph Coloring problem is a very basic problem in combinatorics Q and in statistical physics 0. Given a 
graph, or a lattice, and given a number q of available colors, the problem consists in finding a coloring of all vertices 
such that no edge has the two ending vertices of the same color. The minimally needed number of colors is the 
chromatic number of the graph. 

For planar graphs there exists a famous theorem Q showing that four colors are sufficient, and that a coloring 
can be found by an efficient algorithm, while for general graphs the problem is computationally hard to solve. In 
1972 it was shown that Graph Coloring is NP-complete |j] which means, roughly speaking, that the time required 
for determining the existence of a proper coloring grows exponentially with the graph size. On the other hand if an 
efficient algorithm for solving coloring in its worst case instances exists, the same algorithm up polynomial-reduction 
can be applied to efficiently solve all other problems in the class NP (for a physicist's approach to complexity theory 
see @). 

In modern computer science, graph coloring is taken as one of the most widely used benchmarks for the evaluation 
of algorithm performance The interest in coloring stems from the fact that many real- world combinatorial 
optimization problems have component sub-problems which can be easily represented as coloring problems. For 
instance, a classical application is the scheduling of registers in the central processing unit of computers 0. All 
variables manipulated by the program are characterized by ranges of times during which their values are left unchanged. 
Any two variables that change during the same time interval cannot be stored in the same register. One may represent 
the overall computation by constructing a graph where each variable is associated with a vertex and edges are placed 
between any two vertices whose corresponding variables change during the same time interval. A proper coloring with 
a minimal number of colors of this graph provides an optimal scheduling for registers: two variables with the same 
color will not be connected by an edge and so can be assigned to the same register (since they change in different 
time intervals). 
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The g-coloring problem of random graphs represents a very active field of research in discrete mathematics which 
constitutes the natural evolution of the percolation theory initiated by Erdos and Renyi in the 50's [8|. One point 
of contact between computer science and random graph theory arises from the observation that, for large random 
graphs, there exists a critical average connectivity beyond which the graphs become uncolorable by q colors with 
probability tending to one as the graph size goes to infinity. This transition will be called the g-COL/UNCOL 
transition throughout this paper. The precise value of the critical connectivity depends of course on the number q of 
allowed colors and on the ensemble of random graphs under consideration. Graphs generated close to their critical 
connectivity are extraordinarily hard to color and therefore the study of critical instances is at the same time a well 
posed mathematical question as well as an algorithmic challenge for the understanding of the onset of computational 
complexity ^| . The notion of computational complexity refers to worst-case instances and therefore results for a 
given ensemble of problems might not be of direct relevance. However, on the more practical side, algorithms which 
are used to solve real-world problems display a huge variability of running times and a theory for their typical-case 
behavior, on classes of non-trivial random instances, constitutes the natural complement to the worst-case analysis. 
Similarly to what happens for other very famous combinatorial problem, e.g. the satisfiability problem of Boolean 
formulas, critical random instances of g-coloring are a popular test-bed for the performance of search algorithms 

From the physics side g-coloring has a direct interpretation as a spin-glass model . A proper coloring of a graph 
is simply a zero-energy ground state configuration of a Potts anti-ferromagnet with g-state variables. For most lattices 
such a system is frustrated and displays all the equilibrium and out-of-equilibrium features of spin glasses (the 'Potts 
glass'). 

Here we focus on the g-coloring problem (or Potts anti-ferromagnet) over random graphs of finite average connec- 
tivity, given by the Q(N,p) ensemble: Graphs are composed of iV vertices, every pair of them independently being 
connected by an edge with probability p, and being not directly connected with probability 1 — p. The relevant case 
of finite connectivity graphs is described by p = c/N, with c staying constant in the thermodynamic limit TV — > oo. In 
this case, the expected number of edges becomes M = c/N( 2 ) ~ cN/2, i.e. proportional to the vertex number. Each 
of the vertices is, on average, connected to c other vertices. This connectivity fluctuates according to a Poissonian, 
i.e. the probability of randomly selecting a vertex with exactly d neighbors is given by e~ c c d /dl. 

Two types of questions can be asked. One type is algorithmic, i.e. finding an algorithm that decides whether a 
given graph is colorable. The other type is more theoretical and amounts to asking whether a typical problem instance 
is colorable or not and what the typical structure of the solution space is. Here wc address both questions using the 
so called cavity method |l2| . First we provide a detailed description of the analytical calculations beyond the results 
presented in [13j, where the question of the coloring threshold and of typical solution properties were addressed. 
Second this analytical description is modified and applied to single graph instances. This leads to an efficient graph 
coloring algorithm for the region slightly below the COL/UNCOL transition. In this region, known complete and 
stochastic algorithms are known to fail already for moderate system sizes. 

Let us start with reviewing some known results on the g-COL/UNCOL transition on random graphs. One of 
the first important finite-connectivity results was obtained by Luczak about one decade ago [l4|- He proved that the 
threshold asymptotically grows like c q ~ 2glng for large numbers of colors, a result, which up to a pre-factor coincides 
with the outcome of a replica calculation on highly connected graphs (p = 0(1) for large N). For fixed number 
q of colors, all vertices with less than q neighbors, i.e. of degree smaller q, can be colored for sure. The hardest to 
color structure is thus given by the maximal subgraph having minimal degree at least g, the so-called g-core. Pittel, 
Spencer and Wormald 16] showed that the emergence of a 2-core coincides with the percolation transition of random 
graphs at c = 1 0|, and is continuous. For q > 3, however, the g-core arises discontinuously, jumping from zero to 
a finite fraction of the full graph. For q — 3 they found e.g. that the core emerges at c ~ 3.35 and immediately 
contains about 27% of all vertices. Shortly after, it was realized that the existence of the core is necessary, but in 
no way sufficient for uncolorability 0. In fact, the best lower bound for the 3-COL/UNCOL transition is 4.03 
and numerical results predict a threshold of about 4.7 [l^. The currently best rigorous upper bound is 4.99 |20j . 
It was obtained using a refined first moment method. In statistical mechanics, the latter is known as the annealed 
approximation. More recently, a replica symmetric analysis of the problem has been performed pl|. The resulting 
threshold 5.1 exceeds, however, the rigorous bound, and one has to go beyond replica symmetry. At the level of 
one-step replica-symmetry breaking we are able to calculate a threshold value C3 ~ 4.69 which we believe to be exact. 
We also describe the solution space structure which undergoes a clustering transition at Cd — 4.42. 

The remaining of the paper is organized as follows: In section^we present the replica symmetric (RS) solution of the 
problem and discuss why it fails. In section ITTT1 the one-step replica-symmetry breaking (RSB) solution is presented. 
From this we derive the average graph connectivity for the q-COL/UNCOL transition, and we demonstrate the 
existence of a dynamic threshold associated with the appearance of solution clusters in configuration space. Then, in 
the next section we show that the previous ideas are valid even in the analysis of single-case instances. This allows, 
in section^ to propose an algorithm that colors, in the hard region, single instances in polynomial time. Finally, in 
Sec. IVII some conclusions of the work are presented. 
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II. REPLICA SYMMETRIC SOLUTION 

As stated above, the question if a given graph is g-colorable is equivalent to the question if there are zero-energy 
ground states of the anti-ferromagnetic g-state Potts model defined on the same graph. Denoting the set of all edges 
by E, the problem can thus be described by the Hamiltonian 

where {di} G {1,2, ...,<?} are the usual Potts spins, and 5(-, •) denotes the Kronecker symbol. This Hamiltonian 
obviously counts the number of edges being colored equally on both extremities, a proper coloring of the graph thus 
has energy zero. Since this Hamiltonian cannot take negative values, the combinatorial task of finding a coloring is 
translated to the physical task of finding a zero-energy ground state, i.e. to the statistical physics of the above model 
at zero temperature. 

A. The cavity equation 

In this paper we therefore apply the cavity method in a variant recently developed for finite-connectivity graphs 
directly at zero temperature [23.1231 124| . This approach consists of a self-consistent iterative scheme which is believed 
to be exact over locally tree-like graphs, like G(N, c/N), the set we consider here. It includes the possibility of dealing 
with the existence of many pure states. One has to first evaluate the energy shift of the system due to the addition 
of a new new spin a®. Let us assume for a moment that the new spin is only connected to a single spin, say ax, in 
the pre-existing graph. Before adding the new site 0, the ground-state energy of the system with fixed <j\ can be 
expressed as: 

E N (a 1 )=A-jhhl5(T,<j 1 ) (2) 

T=l 

where we have introduced the effective field fti = (ft 1 , ■■■,h\) and used the superscript N to stress that the previous 
quantity refers to the N -sites systems. Note that a (q — l)-dimensional field would be sufficient since one of the q 
fields above can be absorbed in A. We, however, prefer to work with q field components in order to keep evident the 
global color symmetry. When we connect uo to <7i we can express the minimal energy of the N + 1-sites graph at 
fixed Co, f i & s: 

i 

E N+1 (a , a 1 )=A-Y J h\5(T, o\ ) + S(a , a ± ) (3) 

T = l 

The minimum energy for the N + 1-sites system at fixed do is thus obtained by minimizing E N+1 ((To, <j\) with respect 
to c7i, it can be written as: 

E N+1 (a ) = mmE N+1 (<j ,cj 1 ) = A-w{h x ) - V u T (hx)d(T, a ) (4) 

with 

oj(h) = -mm{-h\... 1 -h q ) (5) 

u T (h) = -mm(-h\...,-h T + l,...,-h q )-oj(h) (6) 

where we have introduced the cavity biases u{h). This choice of to and u is not unique but, according to the previous 
discussion, we have chosen the only manifestly color-symmetric notation. The structure of the cavity biases is easily 
understood if we distinguish among two different cases: 

(i) h T > ft 1 , . . . , h T -\h T+1 , ■ ■ ■ , hi for some r : Then u T = -1 and u a = for all ct^t. 



(ii) h T1 = h T2 > ft 1 , . . . ,h q for some ti,t 2 : Then u = (u 1 , ...,u q ) = (0, 0, . . . , 0) = 0. 
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FIG. 1: Replica-symmetric order parameter 77 vs. average connectivity c for q = 3, from Eq. I|14^ 



Only vectors h with non-degenerate maximal component give rise to non trivial cavity bias u in the direction of the 
minimal component. This is physically understandable: A unique maximal field component in h fixes the correspond- 
ing color, which thus is forbidden to the newly added site. If there are two or more maximal field components, the 
color of the old site is not fixed, thus there cannot be any forbidden color for the new vertex. Each cavity bias in our 
problem thus belongs to the q+ 1-elements set {0, e\, . . . , e 9 }, where e T has all components but the r th equal to — 1. 

If the new spin ctq is connected to k randomly chosen sites with fields hi, . . . , h/., the cavity bias has to be linearly 
superposed and the resulting cavity field on vertex is given by ho — X)»=i u{hi). With high probability (tending to 
one for large N) the k sites will be far from each other in the original graph: Although an extensive number of loops 
is surely present for c > 1 0, these loops have lengths of the order log AT. Inside one Boltzmann state we can thus 
invocate the clustering propriety resulting in a statistical independence of the k selected sites and there cavity 
fields hi (for a more detailed discussion of this point see |23ll24j ). The simplest ansatz assumes that there exists just 
one such state (or a finite number, like in ferromagnets at low temperature), which is equivalent to the Bethe-Peierls 
iterative scheme or the replica-symmetric ansatz in the replica method. Assuming further-on the existence of a well 
defined thermodynamic limit of the energy density E/N and of the probability distributions of local fields (for recent 
rigorous studies in this direction see [25II26I |27 | ) . the distribution of the fields ho of the newly added vertex becomes 
the equal to those of the k neighbors. It is consequently determined by the closed expression 



00 k f ^ 

P{h) = e~ c ]T|y d«h 1 ,...,d«ti n P(h 1 )...P(ti n )S(ti~Y<^( Jl ^ 

k=0 ' t=0 



Q(u) = / d q hP(h)6{u-u(h)) 



(7) 

(8) 
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where we have already used that the connectivities k are distributed according to a Poissonian of mean c. The previous 
equations can be combined in order to have a closed form for the Q(u): 

oo k „ k 

Q{u) = e~ c ^2 77 ]\ dq u l Q(u l )5(u-u(u 1 +u 2 + --- + u k )). (9) 

fc=0 ' J i=l 

From the symmetry of our model under arbitrary permutations of the colors we conclude that 

Q(ei) = Q(e 2 ) = • • • = Q(e q ) = v and Q(0) = 1 - q (10) 

i.e. we need a single real number r\ with < r\ < - to completely specify the probability distribution function Q{u). 
Noting that the probability P^ (h) for a site with k neighbors can be expressed by 

p( k )(h)= I " Y[d q UiQ(ui)S(h - ( n ) 

•* i=l t=l 

and recalling that Hi € {0, e\, . . . , e q } it is easy to rewrite this probability distribution in a compact multinomial form 

P w ( h) = pW (h i h*)= - y-z-^ii-w)"^-^ (12) 

with the convention that 1/n! = for n < 0. Note that h T G (0, — 1, . . . , — k) and that there are correlations among 
the different components of the cavity fields such that P( k > (h 1 , . . . , h q ) ^ I1t=i ^P{h T ). Now we are ready to calculate 
the graph average over the Poissonian connectivity distribution of mean c, 

p{h\ h") = e- c ^^(ft 1 ,..., h«) = e-^ n = n V ^ ^ 

k ' T = l \ >' T—l 

It is interesting to note that after the average the correlations among the different colors disappear and P is the 
product of q Poissonian distributions with average crj. From Eq. JSJl it is possible to derive a self-consistence equation 
for the order parameter noting that the probability rj to obtain a non-trivial cavity bias - say e T - is simply given by 
the probability that the r th component of the local field is the non-degenerate smaller, so setting r = 1 

i-f t - t 1* w-^t^h-ygfflT 1 (") 

/ii=0fi 2 =/il+l h"=hl+l n=0 V V ' / 

where T(n, x) is the incomplete Gamma function defined from the following useful relation 

k=n y ' 

The sum in Eq. I|14fl converges very fast. It is therefore easy to numerically construct a solution to this equation as 
a function of c. For q > 2 it turns out that rj jumps discontinuously from zero to a finite value as shown in Fig. 
where the order parameter rj jumps at c = 5.141 in the case of q — 3. 

This means that, up to c = 5.141 and at the level of the replica symmetric assumption, we only find the paramagnetic 
solution 7] = 0. The solution rj > would account to a spontaneous breaking of this symmetry, there should be a 
finite number of pure states (similar to Neel order in Ising anti-ferromagnets). 

B. The calculation of the energy 

One can easily compute the average shift in the ground state energy when a new spin is added to the iV-sites system 
and it is connected to k spins of the system. The energy of the original graph is given by A — u{hi) while the 

energy of the N + 1-sites system is A — 2j=i[ a; (^i) + w ("(^i))]- Therefore the average shift is given by 

A-Ei = - jr e~ c c k /k\ J d q ui . . . d q u k Q(ui) . . . dQ{u k ) oj uA = - J d q h P{h) w(h) . (16) 

fc=0 \i=l / 
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FIG. 2: Energy density e vs. average connectivity c for q — 3 in the RS approximation from Eq. (1201 

One might be tempted to conclude that Eq. I|16fl equals the energy density of the system, at least for N large enough, 
but this is not true. There is a correction term due to the change in the number of links per variable in the iteration 
N — » N + 1. In fact, generating links with probability c/N in a N + 1 system, instead of c/(N + 1) we are slightly 
over-generating links. So, we need to calculate the average energy shift in a system when two sites - say spins o\ and 
(J2 - are joined by an anti-ferromagnetic link. 

Again, the energy of the original graph is A — uj(h{) — w(/i2), while the energy after the two spins are joined is given 
by A — min CT1)(72 (— h^ 1 — h!^ 2 + °2))- The difference between the two contributions can be written as 

A£ii n k = min(— h^ 1 + min(— h^ 2 + S(ai, a 2 ))) + uj(hi) + uj(h 2 ) 
= min(-^ 1 - u ai (h 2 ) - uj(h 2 )) + cj(^i) + uj{h 2 ) 

= -u{h x + u(h2)) + u(Hi) (17) 
This allows as to express the average link-energy shift as 

AE 2 = J <Ph x d q h 2 P{hx)P{h 2 ) - oj(hi + u{h 2 ))) (18) 

It is interesting to observe that Eqs. (|16f) and (|17|) are model-independent, in the sense that the actual Hamiltonian is 
encoded into the functions uj(h) and u(h) defined by Eq. (JSJ. 
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Using Eq. JSJ and i|13|) one shows easily that Eq. (|16(l reduces to: 

AE, = J2 'Pc V (h 1 )...Vcr l (h q )mm(-h\-h 2 ,...,-h«) 

b>...hi 

q— 1 * \ — oo / — oo \ 

= -£ L! a E^w 9 " a E^fo) as) 

a=0 ^ y ' h=0 \9<h J 

It is also not hard to prove that the average link-energy shift AE2 — qi] 2 . This result can be obtained cither by direct 
computation of the integral, or following a simple probabilistic argument: AE^^ is different from zero whenever the 
two unlinked sites have the same color, but this happens with probability rj 2 for each of the colors. Finally we have 
the the following equation for the energy which is equivalent to the replica-symmetric approximation: 

g—i / \ —00 /— 00 \ 

E = N [AE X ^AE 2 ) = - E I £ h Vc V {h) q - a £?%(<?) ' \rf (20) 

a=0 ^ ' h=0 \g<h J 

The behavior of the energy for q = 3 as a function of the average connectivity c is displayed in Fig. . Let us note 
that for average connectivity 5.141 < c < 5.497 the energy is negative, a particularly baffling result if we consider that 
the Hamiltonian is at least positively defined. This phenomenon is analogous to what already observed for the RS 
approximation in random 3-SAT 38], and is a consequence of the approximation used. We will see in the next section 
how the 1-RSB ansatz cures this pathology. However, before leaving the RS approximation we would like to compare 
our RS results with the RS approximation presented recently by van Mourik and Saad in [2l| since their result clearly 
differs from ours. At the origin of the discrepancy is the fact that they work a population dynamics at very low but 
finite temperature finding a transition around c = 5.1 but without negative energy region. Analogously to what is 
reported in j^, if one works directly a zero temperature the distribution P(h) must be concentrated around integer 
field components, but this is not true anymore at temperature different from zero, as it happens in pi) . They find 
that, in the zero-temperature limit, there remain non-integer field components. In our opinion these are a direct hint 
to the existence of RSB. Instead of including these fields into an extended replica-symmetric approach, we directly 
switch to a replica-symmetry broken solution. 



III. 1 STEP RSB SOLUTION 

The RS results show some evident pathologies and are at odd with numerical simulations [T^. l2l) which predict a 
lower threshold around c = 4.7, and with the rigorous upper bound c = 4.99 20] . What can be wrong in our analysis? 
The main assumption we have made is the statistical independence of the of the k cavity fields. Is it true that long 
distance among spins imply statistical independence? In general the answer we obtain from statistical mechanics is 
no: The assumption holds only inside pure states. 

In this section we will focus on how the cavity method could be used to handle a situation in which there exist 
many different pure states. More precisely we assume that their number M oc e NK is exponentially large in N . The 
connectivity-dependent exponent £ is called complexity and denotes the entropy density of clusters. Note that it differs 
in general from the solution entropy since each cluster may contain as well an exponential number of solutions. The 
first basic assumption we made is that inside each pure state the clustering condition holds. Under this assumption 
the iteration can still be applied but we have to take into account the reshuffling of energies of different states when 
new spins are added. 



A. 1 RSB cavity equation 

We proceed following the same steps of the previous section. Let us take the new spin and let us connect it to 
k spins cti, . . . ,<7fc in the same state a. Thanks to the fast decrease of correlations inside a pure state the energy of 
state a for fixed value of the k spins is 

fc q 

Ea fa, • ■ ■ , o-fe) = A a - E E h U S <T> ^) (21) 
i=l t=1 
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The optimization step within each pure state a runs still in close analogy to the RS computation: when we connect 
oq to (7i, . . . , Ofc, we express the minimal energy of the N + 1-sites graph with fixed a®, by minimizing the N + 1-sites 
system at fixed <tq is thus obtained by minimizing E^ +1 with respect to the k spins: 

k k q 

< +1 (a ) = A a - u(Ki, a ) ~ E E u t (K*)Ht, <to) (22) 

1 — 1 i — 1 T — 1 

This last equation shows that the local field acting on the new spin Co m the state a is 

k 

h ,a = E H h i,a) (23) 
t=l 

and that the energy shift inside a state is 

k 

A£ a = -^w(it(y) (24) 

z=l 

All the previous equation are completely equivalent to those in the RS case except for the fact that now we have a 
a-index labeling the different pure states. One natural question is how cavity fields and the related cavity biases are 
distributed for a given site among the different pure states. This leads us to the notion of survey *- e - the 

site dependent normalized histogram over the different states of both cavity biases and cavity fields: 

1 M 

Qi(ui) = — ^ 6(uj - Uj, a ) 
a — 1 

1 M 

Pi(hi) = -j^^2S(hi-hi ta ) (25) 

a — 1 

In close analogy with what we have already done in the RS case, the existence of a well defined thermodynamic limit 
implies that there must exist unique functional probability distributions Q[Q{u)\ and V[P{u)\ for all the surveys. One 
may wander how could we handle such a big functional space: Fortunately the Q-surveys are described in terms of a 
single real number < r\ < 1/q, cf. Eq. H10|l. and scalar function p(rf) is enough for specifying their distribution: 



Q[Q(u)} = / dr, p( V ) 5 



Q(u) - (1 - qr))6(u) -ri22s(u-e T ) 



(26) 



with <$[■] denoting a functional Dirac distribution. Assuming that the survey of site is distributed equally to those 
of all its k neighbors, we can write: 

00 k r k 

P {h) = ^ c Y.T\ Gk \ d q tiiQi(ui)---d q u k Q k (u k )e y ^=i a > ) S(h~Y,^) (27) 

k=Q ' J z=l 

Q {u) = J d q hP (h)8(u-u{h)) (28) 

Note the presence of the reweighting factor exp(yw(^^ =1 Ui j) that arise after conditioning the probability distributions 
of the hs to a given value of energy |23|. the prefactors C k are normalization constants depending on Qi(u), Qk{u). 
The reweighting parameter y is a number equal to the derivative of the complexity S(e) of metastable states with 
respect to their energy density e = E/N: 

Intuitively, this reweighting factor can be understood as a penalty e v AEa one has to pay for positive energy shifts. 
Note that Eqs. and J2H1) can be cast in the following form 

„k 



Qo(uo) = e" c E|[ Cfc /^iQi("i)---^feQfe(^)e ? '" (l:tl " l) ^o-^i + --- + ^)) 

k=Q ' 

= e- c jr^C k J d«hP(h) e y^5(u ~u(h)) (30) 
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In the last line we have introduced the auxiliary distribution P(h) which would result in Eq. i|27|) without reweighting 
(i.e. by setting y = 0). It has no direct physical meaning in this context, but it will be of great technical help in the 
following calculations. 

Let us first concentrate on the colorable phase, where the ground states are proper q-colorings and have zero energy. 
Consequently no positive energy shifts are allowed, so this phase is characterized by y — oo. Let us first calculate the 
value of the normalization constants Ck in this limit. Note that uj(h) < for all allowed h (each component of h is 
non-positive as h results from a sum over us). This means that the only surviving terms in Eq. i|30|) are those with 
zero energy shift ui(h) = 0, i.e. all fields must have at least one zero component, allowing for the selecting of at least 
one color without violating an edge. Let us first specialize to the case q = 3 for clarity, the generalization to arbitrary 
q is straightforward. Summing over u both sides of Eq. (|3U|) we have: 

^- = P(0 1 ) 0) + 3^P(/ l 1 ,0 ) 0) + 3 P(h\h\0) (31) 

where the combinatorial factors 3 appearing in the r.h.s. are obtained by noting that P(h, 0, 0) = P(0, h, 0) = P(0, 0, h) 
and that P{h}, h 2 , 0) = P{h x , 0, h 2 ) = P(0, h 1 , h 2 ). Combining Eqs. ffT jl .ffl fy and JTOJ) we get 

(32) 
(33) 



£(0,0,0) 


k 

= na 

1=1 


- %Vi) 






k 

= na 

i=l 


- 2%) 


k k 

- P(0, 0, 0) = [[(I - 2th) - II( 1 - 

i=l i=l 


£ p(h\h 2 ,o) 

ft 1 ,/t 2 <0 


k 
i=l 


- Vi) - 


2 £ P(fc 1 ,0,0)-P(0,0,0) 




k 

= na 

1=1 


- Vi) - 


fc fe 
•2jJ(l-2 W )+JJ(l-3» ?i ) 

i=l j=l 



( 34 ) 

i=l j=l 

Plugging these relations into Eq. l|3*T|l we finally get 

i =3n(l-^)-3n(l-2ry l ) + II( 1 - 3 ^) ( 35 ) 

2=1 2=1 2=1 

Note also that in close analogy to the analysis that lead to Eq. I|14fl . we can interpret l|34f) as the (un- normalized) 
probability of having the survey in site pointing in direction e^. Therefore combining Eqs. (|34|l and (|35|l we obtain 



„ - f (n n) - ntia - vi) anti(i - 2 ^) + nti(i - ^> ,^ 

Vo- fk{Vi,---,Vk) - fc — — — — ^ — (36) 

3Jl J =i( 1 - W - 3]l 4 =i( 1 - 277,) + ll l =i( 1 - H 

At this point we are ready to write the 1-RSB iterative equation for the Q-surveys 

00 Jfe 



c f 

P(v) = e~ c £ y / drjtpirn) ■ ■ ■ dn kP {r] k ) 5(r) - f k (vi, ■ ■ ■ , Vk)) (37) 
fc=0 ' J 

Eq. i|36|) can be easily generalized to an arbitrary number q of colors, 

/.(„,..,»)- Ea( _ 1)Ifc , i)nti[1 _ + 1)t)(] (38) 

The self-consistent equation <|37|) resembles a replica-symmetric equation and can be solved numerically using a 
population dynamic algorithm: 

(i) Start with an initial population rji, t^tk of size M. which can be easily chosen to be as large as 10 6 to generate 
high-precision data. 

(ii) Randomly draw a number k from the Poisson distribution e~ c c k /k\; 
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FIG. 3: Probability distribution function p(rf) in the case q = 3 for average connectivities 4.42 < c < 4.69. Note also that a 
delta peak in 77 = is always present (but not displayed here). 



(iii) Randomly select k + 1 indices irj, ii, ik from { 1 , . . . , Ai } ; 

(iv) Update the population by replacing r] io by fdiVh^ ■••) J 7ijJ; 

(v) GOTO (ii) until convergence of the algorithm is reached. 

One obvious solution of Eq. I|37|) and (|38|) is the paramagnetic solution S(rf). For small average connectivities c it 
is even the only one. The appearance of a non-trivial solution coincides with a clustering transition of ground states 
into an exponentially large number of extensively separated clusters. In spin-glass theory, this transition is called 
dynamical. Still, p(r/) will contain a non-trivial peak in r\ = due to small disconnected subgraphs, dangling ends, 
low-connectivity vertices etc. The shape of p{rf) in the case q = 3 is displayed in Fig. for connectivities c ranging 
from Cd to c c . 

The weight t of this peak can be computed self-consistently. Let us again consider first the case q = 3. Keeping in 
mind that for y — > 00 the field h has at least one vanishing component, the only possibilities to obtain u(h) = are 
given by h = or by a field h with one single non-zero component. So the probability that the cavity field acting on 
a given site with k neighbors equals zero is given by the sum of the probabilities that all neighboring cavity fields are 
zero (equal to t k ), plus the probability that exactly one cavity bias among the k is non-trivial (equal to k{l — i)£ fc_1 ). 
The average over the Poissonian degree distribution leads to 

OC 

t = e- c ^y {t k + fct*- 1 ^ - t)) =e- (1 - t)c (l + (l-t)c) (39) 

fc=0 
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Generalizing Eq. (|39|l to a general number q of colors easily gives 

t'^Enr- ( 4 °) 

1=0 

This equation is quite interesting, since a non-trivial solution forms a necessary condition for Eq. 1)37(1 to have a 
non-trivial solution. In fact, this equation was first found in [lfij . the fraction of edges belonging to the g-core is given 
by (1 — t m i n ) with t m i n being the smallest positive solution of Eq. (|40l) . Thus, we also find that the existence of an 
extensive g-core is necessary for a non-trivial p(rf), and forms a lower bound for the g-COL/UNCOL transition. 

Unlike in the case of finite-connectivity p-spin-glasses or, equivalently, random XOR-SAT problems [2H Isfl Isij. 
the existence of a solution t < 1 is not sufficient for a non-trivial p(j]) to exist. The latter appears suddenly at the 
dynamical transition Cd, which can be determined to high precision using the population dynamical algorithm. This 
solution does not imply uncolorability, but the set of solutions is separated into an exponentially large number of 
clusters. The number of these clusters, or more precisely its logarithm divided by the graph size N, is called the 
complexity £ and can be calculated from p{rf). 



B. The calculation of energy and complexity 



More generally, we expect also a large number of metastable states at non-zero energy to exist. Hereafter we will 
assume that they are exponentially many, N(e) cx exp(iVE(e)), where the complexity S(e) is (despite of the use of a 
capitol letter!) an intensive function of the energy density e = E/N. We can introduce a thermodynamic potential 
<Kv) M as 

0(„) = _ J_ In ( J de e N{-ve+s(e)}^ ( 41 ) 
For large N, we calculate this integral by its saddle point: 

4>{y) = min (e - -S(e)) = e sp - -E(e sp ) (42) 

It is easily verified that the potential 4> calculated at the saddle point energy e sp (y) fulfills the usual Legendre relations: 

d y [y<j)(y,e sp (y))} = e sp (43) 
y 2 d y (f>(y,e sp (y)) = S(e sp ) (44) 

Around the saddle point the complexity can be approximated, according to equation (|42|l . by 

£(e) ~ S(e sp ) + y(e - e sp ) = -y<j>(y) + ye (45) 

We will now consider a cavity argument: let us denote by En the energy of a system composed of N sites, the density 
of configurations is given by 

dN N (E N ) cx e- y *"M +vBir dE N (46) 

with $jv(y) denoting the extensive thermodynamic potential with limit $jv(y)/A^ — » (f>(y). Now we add a spin to the 
system. If we consider that the total energy is Ejq + \ = En + A.E, we can express the density of configurations in 
terms of En and AE: 

dAf N+1 (E N , AE) oc e v^N(.y)+y(E N +AE) ^ P(AE) dAE . (47) 

Integrating over 5E we get 

dJV N+1 (E N+1 ) = Ce- y * N +^ y '* +yEN + l dE N+1 (48) 
C = - [ P(AE)ey AE dAE=-{ e y AE ) P(AE) (49) 

yj y 

Comparing the previous equations with 14611 we can deduce that 

$N+i(y) = ®n(v) - - ln(exp (yAE)) P{AE) . (50) 

y 
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In the thermodynamic limit we can thus identify 

4>{y) = — ln(cxp {yAE)) P(AE) (51) 

y 

In close analogy with what we have already done in the RS case, and using Eq. I|24|) . we can compute ^ as a site 
contribution plus a link contribution in the 1-RSB scenario: 

• Site Addition 

exp ( - yAtpi) = J d q u. ll Q ll (u ll )---d q u lk Q lk (u lk )exp u^) j = ^- (52) 

• Link Addition 

exp ( - yA(j) 2 ) = I d q hi 1 P il (h il ) d q h ik Pi 2 (hi 2 ) exp ( - ya^KiJ + ya^K^ + w(fri 2 ))) 

d'/iPi! (h)d q uQ l2 (u) exp[-y(tjj(h) - u;(h + u))} 

= l + qm^ie-y -I) (53) 

Note that in the limit y — > and assuming Pi — P for each site, we obtain the RS expressions. Once the functional 
distributions Q[Q(u)} and 7 ? [P(/i)] are known we can eventually average the energy shifts A<f>i, A(f) 2 in the usual 
linear combination: 



4>(y) = Afa ~ -A0 2 (54) 

where the over-lines denote the average over both disorder and functional distributions. One finally finds 

1 °° c k r k ( r k ( k \\ 

<Kv) = —J2 e ' C Ti / TiVQMQil ln / n&UiQiiui) exp yu^*) + 

y k=i fc - J t=o y t=o \ i=\ ) ) 

I VPlV[Pl] VWm^U d q hiPi(hi) d q h 2 P 2 (h 2 )exp^yiu(h 1 )-yLj(h 1 +u{h 2 ))^ (55) 

In the limit y — > oo these relations can be written in a more explicit form. Let us consider first the term A<f>i in 
Eq. Referring to Eq. l|3*5*|) it easy to see that: 

9-1 / 



lim e-y** = B- 1 )' h 1 ! - (I + I),,] (56) 

i=o ^ ' i=l 

such that 

lim = J] e~ cC - / dpfa) . . . dp(n k ) ln £(-1)' « ) JJ[1 - (I + 1) % ] . (57) 

In order to compute the average link contribution A<fi 2 (y) we need to evaluate the large y limit of Eq. I|53(l which 
gives: 

lim -yA$ 2 (y) = / dp(rji)dp(r] 2 ) ln (1 - qrjirj 2 ) (58) 

y^oo J 

This equation has a nice probabilistic interpretation complementary to that used in the derivation of AE 2 in the RS 
case. In fact the integrand of l|53[) is different from zero for y — > oo only when both sites i\ and i 2 have a different 
color, and this happens with probability (1 — qrj^rj^) (note that qij^rj^ is the probability that the two sites have same 
color). It is now clear from Eq. <|42|l that taking the y — > oo of — y$(y) gives us the complexity at least in the COL 
region where e = 0: 

oo k r /9 _ i r \ k \ 

£(e = 0) = E e ~ C T7 d Vl p( m )...d Vk p( Vk )ln £(-!)'( J ) U 1 " + ^ 

k=l ' J \l=Q ^ ' t=l / 



dr)ip(r)i)dr) 2 p(r)2) In (1 - g)]!^) (59) 
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C. Results 



The previous analysis results for the q-coloring problem in the existence of a dynamic transition, characterized by 
the sudden appearance of an exponential number of clusters that disconnect the solutions of the problem. This is 
represented in figure0]for q = 3 and 4, where the complexity is plotted as a function of the graph connectivity. Note, 
that at a certain value average connectivity c = Cd the complexity abruptly jumps from zero to a positive value. Then 
it decreases with growing c and disappears at c q where the number of solutions become zero. It is not possible any 
more to find a zero-energy ground state for the system, i.e. the graph becomes uncolorable with q colors, and its 
chromatic number grows by one. 



8 



10 



q=3 



6 7 

c 




8 



10 



FIG. 4: Top: Complexity E(c) vs. average connectivity for q = 3 and q — 4. Non-zero complexity appears discontinuously at 
the dynamical threshold ca, and goes down continuously to zero at the q-COL/UNCOL transition. The curves are calculated 
using the population-dynamical solution for p(»y) with population size M = 10 6 . 

Bottom: The full line shows the chromatic number of large random graphs vs. their connectivity c. The symbols give results 
of smallk for N = 10 3 , each averaged over 100 samples. 

In the following table, we present the results for q — 3, 4, and 5, for the dynamical transition we show the correspond- 
ing values of c d , of the entropy s(c<j) = lng + c d ln(l — l/q)/2 (33 and the complexity £(cd). For the q-COL/UNCOL 
transition, the critical connectivity c q and the solution entropy are given. Like in random 3-satishability j3j| and 
vertex covering |3^| . this entropy is found to be finite at the transition point. 
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FIG. 5: Average thermodynamic potential <j>(y) vs. y in the HARDCOL phase (c = 4.5) and in the UNCOL phase (c = 5.0). 
Note that 4>{y) above the paramagnetic region ( (j> — ) is a monotonously increasing function of y in the first case, while it 
displays a maximum at finite y in the second one. 



q 


Cd 


s(cd) 


E(d) 


c q 


s(c q ) 


3 


4.42 


0.203 


0.0223 


4.69 


0.148 


4 


8.27 


0.197 


0.0553 


8.90 


0.106 


5 


12.67 


0.196 


0.0794 


13.69 


0.082 



In Fig. JBJ we display the average complexity E as a function of the energy density e in the 1-RSB approximation. 
Recently Montanari and Ricci showed in that in the p-spin spherical spin glass the 1-RSB scheme is correct 
only up to a certain critical energy density e<3, above which this solution become unstable and a FRSB calculation 
is required. It is possible that such a phenomenon might happen also in this case. The dynamical transition is not 
only characterized by a sudden clustering of ground states, at the same point an exponential number of meta-stable 
stains of positive energy appears 0. Such states (besides algorithm-dependent entropic barr iers which may exist 
even below Cd) are expected to act as traps for local search algorithms causing an exponential slowing down of the 
search process. Well known examples of search processes that are overwhelmed by the presence of excited states are 
simulated annealing or greedy algorithms based on local information. 

To test this prediction, we have applied several of the best available solvers for Coloring and SAT problems available 
in the net [g, l37| . After some preliminary simulations we observed that the best results could be obtained with the 
smallk program |37| and concentrated our efforts on it. The simulation results, as shown in the lower half of Fig. 

were obtained in the following way: First a random graph (N = 10 3 ) was generated and we tried to color it 
with a small number of colors (here q = 3). If, after some cutoff time (we probed with 10 seconds, 1 minute and 2 
minutes without substantial changes), the graph was not colored, we stop and tried to color it with larger q. For each 
connectivity we averaged over 100 samples. As it can be clearly seen, the algorithm fails with q colors slightly below 
the dynamical transition, confirming our expectations. In Sec. HVI we explain how the cavity approach helps to design 
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FIG. 6: Average complexity E as a function of the energy density e for various average connectivities c. In this figure we only 
display the physical branches (see text). 



an algorithm being able to deal also with this problem. 



D. The large-g asymptotic 



From Eqs. (|S7jl and one can easily deduce the large-g asymptotics of p(rj). For average connectivities c >> q 
(the threshold c q is expected to scale like 0(q\nq)), fk is dominated by the I = O-contributions in the numerator and 
in the denominator, leading to p(?y) = S(ri — 1/q) in leading order. Plugging this result into the Eq. I|59|) one can 
easily calculate the COL/UNCOL threshold c q by setting the complexity to zero. Taking care only of the dominant 
contribution we find 



c q = 2qlnq + o(qhiq) . (60) 

This result coincides with the exact asymptotics found by Luczak |l ll . Note, however, that the same dominant term 
can also be obtained from the vanishing of the replica-symmetric (paramagnetic) entropy s(c) which is expected to 
be exact up to the COL/UNCOL transition. This means that, for q — > oo, the threshold entropy goes to zero. This 
behavior could already be conjectured from the above table where the threshold entropies are given for small q. The 
derivation of the sub-dominant terms in Eq. l|tjU|) requires a much more detailed analysis and goes beyond the scope 
of this paper. It will be presented in a future publication together with analogous results for i^-SAT |36| . 
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IV. WORKING WITH SINGLE GRAPH INSTANCES: SURVEY PROPAGATION 



Up to now we have solved analytically the coloring problem averaged over the set of Erdos-Renyi graphs at given 
average connectivity. In this way we derived the g-dependent threshold connectivities of c q at which the graph becomes 
almost surely uncolorable with q colors, i.e. the location of the COL/UNCOL transition. We have also demonstrated 
the existence of another threshold value Cd above which a clustering phenomenon takes place in the space of solutions. 

However, one of the relevant consequences of this cavity approach is that it can be naturally implemented to study 
single case instances, i.e. specific non-random graphs which have to have a locally tree-like structure to fulfill the 
conditions of the cavity approach. In the average-case analysis at each step of the iteration, we selected randomly k 
sites from the M possible ones to be used in Eq. (|36|) . and we substituted another randomly chosen entry 770 from 
the M. possible entries. From here on, we will assume that the iteration procedure used above is also valid for single 
instances - with one significant change: For the generation of survey for one vertex (or edge) we have to use its actual 
neighbors, the connections between sites are fixed once for ever by the specific graph under consideration. 



A. The survey propagation algorithm 

This algorithm works in a way similar to the sum product algorithm ^3). In the latter, to each vertex arrive 
u- messages from k — 1 neighbors, then this messages are transformed (become ^.-fields) and sent as a new message 
through the link to the descendant k neighbors. So, at each time step, in the links of the graphs you will have 
messages traveling, like in a communication network. The survey propagation algorithm (SP), works with the same 
principle. The basic difference is that now the messages are replaced by u — surveys of the messages (i.e. by probability 
distributions of messages). SP is defined for one given value of the reweighting parameter y that must be optimized 
to minimize the "free energy" of the system. To each edge {i,j} of the graph we associate two u-surveys Qi^j{u) and 
Qj^i(u) of messages traveling in the two possible directions. The algorithm self-consistently determines these surveys 
by a message passing procedure to be described below, and finds consequently all the thermodynamic properties of 
the model defined on the specific graph. Let us now describe how SP works in practice for the 3-coloring problem: 

1. Select a graph G = (V,E). 

2. All the Qi— t j{u) with {i,j} G E are randomly initialized. 

3. We sequentially consider all sites i and randomly update the links {i,j} to all neighbors j in the following way: 

(a) For each neighbor j of i we calculate: 



l\i(J>) = C'ij I [ II d q u k Q k ^{u k ) 

kev(i)\j 



<H h - ^ u k \ exp < y ui \ ^ "fc ) \ (61) 
kev(i)\j J I \kev(i)\j 



where with the symbol V (i) denotes all neighbors of i. The prefactor Cju is chosen such that Pju is properly 
normalized to one. 



(b) From Pju(/i) we derive the new u-surveys of all edges {i, j}: 

Qi^j(u) = j d q h Pi\j(h)S (u - u{h) 



(62) 



4. The iteration step 3. is repeated until convergence is reached. 
It was already shown in [24| that the free energy of the system may be written as: 



1 

TV 



[i,j}eE i 



(63) 



where rn is the connectivity of the vertex i, and 0'™ fe (?/) and </)™ ode (y) represent the contributions of links and vertices 
which are given by: 



= ~~ hi (J d q hP lU (h) d?uQ^i{u) exp [-y [u{h) - to(h + u)] } 



(64) 
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FIG. 7: Free energies cj> as a function of j/ for three given samples of N = 10000 of connectivities c = 4.60, 4.69, 4.80. 

and 



€ ode (y) = - 




Repeating the above procedure for various values of y, Eqs. I|64f> and I|rj5|) do not only provide the values of <fi(y), 
but also E(y) = —y 2 d(j)(y)/dy and the energy density e(y) = d{y<&{y)) / dy of states. The parametric plot of S(y) 
versus e(y) gives the complexity of states as a function of their energy. For example, Fig. @ shows the free energy 
4>{y) of single graphs with N — 10000 vertices as a function of y for three different values of the average connectivity 
c. 

We observe that for high enough connectivities the maximum of 4>{y) is located at finite values of y. While decreasing 
c, the location of the maximum grows and approaches y — » oo at the coloring threshold. From these curves and by 
means of numerical derivatives, we may also calculate the complexity and energy. Fig. JSJl shows the two branches 
obtained in the parametric plot of S(y) vs. e{y) for various connectivities c. While the physical meaning of the 
upper branches is not clear |23| we wanted to stress that they interpolate between the RS solution and the maximum 
complexity point. 

From the previous figure we may extract two characteristic values of the energy: The first one, is associated with 
the minimal number e^N of miscolored edges in the graph, i.e. it gives the ground state energy of the instance. The 
value of e g s is determined as the positive point where the lower branch of the complexity curve intersects the energy 
axis, or it equals zero if E(e = 0) > on the lower branch. 

The other relevant energy value is the threshold energy eth- It is determined by the point where the complexity 
reaches its maximum. It is therefore the point where e.g. simulated annealing gets stuck. The same remark of 
Sec. (|III C|l holds here: this calculation should be probably improved along the line of [35| in order to take into 
account the FRSB instability at higher energy density as in the case of the p-spin spherical model. 
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FIG. 8: Complexity E as a function of e for three given samples of random graph with average connectivities c = 4.60, 4.69, 4.80 
and N = 10000 sites. At odd with Fig. JSJ here we display both physical and unphysical branches. 

From the practical side this is of course not the way to determine this values, it is much more desirable to look for 
the value of y at which <fi(y) becomes maximal, cf. Eq. <|44|) . Fig. shows a plot of these two energies as a function 
of the connectivities obtained using this single-instance algorithm. 

Of course, the exact meaning of the numerical values of these quantities is an open question. In principle they 
were defined for infinite systems, whereas our single-instance algorithm works for systems of finite sizes N. We expect 
that the numerical values give good approximations once we look to large values of N, where e.g. the scales dividing 
distances of solutions inside one state from those between states are well separated. A more detailed discussion about 
this may be found in 0, 0] • 

V. A POLYNOMIAL ALGORITHM TO COLOR GRAPHS 

The Survey Propagation described above was very useful for the design of an efficient algorithm to find a solution 
of randomly generated 3-SAT formulas [22H24| in the hard but satisfiable phase. Here we will show that, with small 
modifications, the same idea can be extended to the q-coloring problem. 

The relevant idea in this algorithm is to fix spins which are strongly biased toward (or away from) one color. 
Therefore, we have to first determine the distributions of local magnetic fields in the system using SP, and select 
those which have the strongest bias. Once these are fixed, the problem is reduced. We can rerun SP on the reduced 
instance, new spins may be biased and fixed. The procedure will be repeated until only paramagnetic spins remain. 
At this point SP cannot help any more, but surprisingly the decimated coloring problem becomes "easy" . Using any 
reasonable local solver known in the literature, we can proceed to construct a proper coloring. 

In the case of g-COL the subject is technically slightly more complex than in K-SAT, since spins can be biased in 
q different directions and it is hard to decide what do we mean precisely by biased. In addition, by fixing the color of 
one vertex, all its neighbors have to have different colors, i.e. they are left with q — 1 colors. In the reduction process 
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FIG. 9: Density of miscolored links e gs vs. average connectivity of the graph c (lower dotted curve) and threshold energy 
density e t h vs. c (upper continuous curve) in the 1-RSB approximation. 



the problem, initially being a pure g-coloring problem, becomes a list coloring problem where each vertex has an own 
list of allowed colors. In this way the permutation symmetry of colors is broken, which requires a modification of the 
SP given above to non-symmetric surveys. 

In order to keep the presentation as simple as possible we concentrate our efforts on the 3-coloring problem and 
hence, from now on, all the discussion will be associated for the case q = 3. The extension of the results to higher q 
is, however, straightforward although exponential in q. 

As mentioned above, the first things we should do are a generalization of SP to non-color-symmetric situations, 
and to correctly define a biased spin. Let us start first noting that equation l|26|l may be written as 



Q[Q(u)] = / d^rfp(ff) 5 



Q(u) - r] S{u) - V T 5(i 



(66) 



where we simply avoid to consider the color symmetry of the problem, and where we introduce 77° = (1 — X)r=i 
Then, following the same lines of reasoning that lead from l|26f) to (|36[) we may deduce the following update of the 
surveys in the limit y — > 00: 



11— *3 



p=l,2,3 Yiki£V(i)/j 



(Vk^i + rfk^i) + TlkeV(i)/j Vk^i 



(67) 



for r 6 {1,2,3}. The value of can be calculated by imposing the normalization condition. Using this update 
rule instead of the one proposed in the above version of SP, we directly work with a reweighting parameter y = 00 
which forbids any positive energy changes and thus characterizes proper colorings. 
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Having r/J, for all the sites of the graph, we have to define the site dependent color polarizations 

Ep=l,2,3 E[jey(i)(l ~ rfj-fi) ~ Z)p=l,2,3 rijey(j) + + IIfcey(») 'Tj'-h 

for r = 1, 2, 3. This equation is analogous to Eq. (|67J) but the products are extended to all neighbors. The polarization 
11^ is the probability that vertex i is fixed to color r in a randomly selected cluster of solutions. Vertices which may 
change their color within one cluster are characterized by II? = (1 — Y) r — i H£ ). Once these polarizations are known, 
many strategies can be adopted for coloring the graph. We believe that the simplest and most intuitive one is the 
following: 

(i) If one spin is very biased to one color, fix that spin and remove it from the graph. Forbid this color to all 
neighbors. 

(ii) If the bias of one spin toward some color is very low, forbid that color. 

Forbidding a color c to a node i implies rewritting Eq. I|67|) using only two colors for that particular node. This 
can be achieved simply by takin Eq. I|67|) and i|68[l but setting jy? , = and = 1 for all k € V(i). 

Furthermore, during the processes discussed above, it turned out, that many vertices get surrounded by neighbors 
with fixed colors. In that case, the spin can be fixed to one of their remaining allowed colors immediately, and it is 
removed from the graph. 

In practice, we put a cutoff for the value of the bias to be used for the previous criteria. We use rule (i) every 
time a bias towards some color is greater than 0.8 and rule (ii) if the bias was lower than 0.15. There is no special 
reason for selecting specifically these values, but we found numerically a fast convergence to solvable paramagnetic 
problem instances. It could be useful to make a systematic analysis for improving this choice, and also to discuss other 
selection rules. However, this is not the objective of the present work. Here we just want to demonstrate that the 
algorithm works substantially better than every other local search algorithm we know, even without any parameter 
optimization. 

Summarizing, the discussion above, our algorithm follows the next steps 

1. Take the original graph and run SP in its infinite-?/ version defined by Eq. (|67H . 

2. Calculate the biases of all spins according to (|68fl . 

3. Select spins whose bias to one color is larger than 0.8, fix and remove these spins from the graph. Forbid the 
color to all neighbors. 

4. Select spins whose bias to one color is lower than 0.15 and forbid that color to these spins. 

5. Take all spins where just one color is allowed, fix these spins, and remove then from the graph. Forbid the fixed 
color to all neighbors. 

6. If the the graph is not completely paramagnetic: rerun SP and go to 2. 

7. Run any smart program that solves the coloring sub-problem. 

Actually, we did not find any free program in the web which was able to easily handle large graphs for the coloring 
problem. The best we could find was the smaZ/fc-program by Culberson [37]], but even in the easy region it exploded 
in memory for graphs with sizes larger than N = 2000. So step 7 above was changed into: 

7.1 Transform the resulting graph into a satisfiability problem. 

7.2 Run walk-SAT |6j on this satisfiability problem. 

An interesting point about the algorithm described above is the fact that we can fix a certain percentage of spins 
in every algorithmic step, without rerunning SP every time. This drastically reduces the computational time. How 
many spins we may fix, depends in a non-trivial way on the system size and on the distance from the COL/UNCOL 
transition. 

Figure ITUI shows the success rate of our algorithm in 3-coloring random graphs in the hard region c 6 [4.42,4.69]. 
From left to right the sample sizes increase: N = 4*10 3 , 8*10 3 , 16*10 3 , 32*10 3 and 64*10 3 . In all the cases we fixed 
the 0.5 percentage of the spins in every iteration step. Note that keeping this value fix we find a clear improvement of 
the algorithm for sizes going from TV = 4 * 10 3 , 8 * 10 3 to N = 16 * 10 3 the performance is roughly the same for larger 
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FIG. 10: Probability of coloring a graph using our algorithm for different lattice sizes. From left to right N — 4 * 10 3 , 8 * 10 3 
16 * 10 3 , 32 * 10 3 and 64* 10 3 . 

lattices suggesting that we should reduce the fraction of spins to fix. However, note that even within these conditions 
the algorithm works quite well in the hard region of the system. Note, that strong finite size effects are present, in 
fact the algorithm doesn't behave very well for small lattice sizes. Two reasons may explain this: First there are short 
short loops that disappear in the thermodynamic limit, second there is some shift in the location of the COL/UNCOL 
transition towards higher connectivities for larger graphs. This point should be clarified in a fortcoming work. 

Another relevant feature of the curve is the following: The closer our graph is to the critical point, the smaller is 
also the percentage of spins we may fix in one algorithmic step. However, extrapolating the results, the worst we 
can find is to fix only one single spin at the time. This would change the complexity of our algorithm from N\nN 
(resulting from sorting spins with respect to their biases) to TV 2 , i.e. the algorithm remains polynomial. 

VI. CONCLUSIONS AND OUTLOOK 

In this work we presented a detailed derivation of the one-step replica-symmetry broken solution of the coloring 
problem on random graphs. The problem consists in finding a coloring of all vertices of the graph such that no two 
adjacent vertices carry equal colors. From the average case point of view, the one-step RSB approach allowed to 
determine the g-COL/UNCOL transition c q for arbitrary color numbers q. This means that large random graphs of 
average connectivity below c q have proper g-colorings with high probability (approaching one for N — > oo), whereas 
graphs with higher connectivity require more colors for a proper coloring. Moreover, we find the existence of a 
clustering transition in the colorable region. This transition is characterize by the appearance of an exponential 
number of states separated by large energetic barriers. The clustering transition is accompanied by the sudden 
appearance of an exponential number of metastable states that - intuitively - cause local algorithm to get stuck. 

We also extended our results to the study of single case instances, i.e. specific realizations of random graphs, 
showing that the previous analysis remains valid. With this understanding we also implemented a new algorithm, 
based on the idea of a survey propagation that enabled us to solve the coloring problem in the hard clustering region 
in polynomial time. We present results for sizes as large as N ~ 10 5 vertices, which is far beyond the performance of 
other algorithms on random graphs. 

There are many interesting direction to extend this work. A first one concerns the survey-propagation algorithm. We 
were able to report quite encouraging results for the SP inspired graph reduction procedure if applied to the clustered, 
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i.e. hard but colorable phase on random graphs. These graphs are characterized by a locally tree-like structure, loop 
are of length O(lniV). This structure allowed us to use the statistical independence of surveys restricting a randomly 
selected vertex inside each pure state. This assumption fails, however, if the graph has some non-trivial local structure 
as given by small loops, small highly connected subgraphs etc. Before being of real practical value, SP should be 
extended to such situations, following e.g. the lines used by Yedidia et al. in [Hi in their generalization of belief 
propagation to locally non-treelike graphs. 

A second possible extension of our work concerns the interpretation of colorings as ground-states of a Potts- 
antiferromagnet, which is a model known to show glassy behavior at low temperatures (the so-called Potts-glass), see 
e.g. In the present work we have directly worked at zero temperature, but the extension to non-zero temperature 
is straight-forward. In this context it is interesting to see that for q = 3 a continuous full replica-symmetry breaking 
transition appears at the level of fields of 0(T) - before the one-step solution appears for fields of 0(1). So we expect 
that the one-step RSB transition in this model exists in a strict sense only at zero temperature, in temperature it 
is only a (sharp) cross-over to glass-like behavior. This phenomenon disappears for larger q, but it is interesting in 
how far it can influence the usual glassy phenomenology known from fully-connected spin-glass models. Let us also 
point out that interestingly enough a similar scenario holds also in the random K-SAT [42| case. Using in addition 
the approach suitable for single graph instances, one can, e.g., study inhomogeneities arising in the glassy phase |43| 
and thus go beyond the usual paradigm of disorder averaged results for randomly disordered models. 
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